Algorithms for Cost-Effective Video Compression

Author

  • Angel DeCegama
Abstract

The volumes and costs of video storage and transmission are soaring. This situation can only be ameliorated by massive investments in infrastructure, by technological breakthroughs, or both. This paper presents one such breakthrough: a method that can reduce the size of any video file compressed by any existing video codec (e.g., MPEG-4, H.264, DivX, VC-1) to between 25% and 10% of that compressed size, without loss of the video quality obtained when the codec decompresses and displays its own compressed file, and without any changes to the codec. The processing that achieves these results prepares the frames of the original video file before passing them to a given codec. The codec then processes the received frames in its usual way and produces a much smaller compressed video file than it would without the pre-processing. The compressed video can then be stored and/or transmitted. For decompression and playback, the codec decompresses the compressed video frames in its usual way and then passes them to the algorithm presented in this paper for final post-processing before display, with a quality indistinguishable from that produced by the codec alone without such pre- and post-processing.

I. MATHEMATICAL BASIS

The mathematical principles behind the algorithms presented here are those of the Wavelet Transform (WT). This is important because it has been demonstrated that human beings use the basic concepts of the WT to process all sensory information in their brains, especially visual information, which requires enormous amounts of compression. Such compression involves discarding all data that is irrelevant from the standpoint of human perception. There have been many attempts to apply the WT to the problem of video compression (4,5,6).
The results have been encouraging, showing better results than other existing codecs, but not good enough to motivate the people working in this area to change their established procedures, because of the computational complexity of the WT methods needed to achieve high-quality video with WT-based codecs. The approach presented in this paper is different: do not change the codec; pre-process and post-process the video.

II. DETAILED DESCRIPTION

In this approach, a crucial feature is the ability to recreate a given image or video frame from the low-frequency component of its WT, which is 1/4 the size of the original image or video frame. This can be done precisely by applying the math of the direct WT and the IWT (inverse WT). In order to minimize computational complexity, the Haar WT can be used. The direct Haar WT low-frequency coefficients are a2 = 0.5 and a1 = 0.5, and the high-frequency coefficients are b2 = 0.5 and b1 = -0.5. The IWT low-frequency coefficients are aa2 = 1.0 and aa1 = 1.0, and the IWT high-frequency coefficients are bb2 = -1.0 and bb1 = 1.0.

The WT is applied to the individual pixel rows and columns of a given image or video frame. This is done separately for the luminance (Y) and chrominance (U, V) components of the pixels of each row and column. It can also be done for the R, G and B planes. Let us define a set of y_i to be the values of one such component along a given row or column of an image or video frame. Let us also define a set of x_i to be the corresponding WT low-frequency values and a set of z_i to be the corresponding WT high-frequency values. The Haar WT is then computed with decimation and no wraparound; the procedure for these calculations is shown in Fig. 1.

International Journal of Emerging Technology and Advanced Engineering, Website: www.ijetae.com (ISSN 2250-2459, Volume 2, Issue 3, March 2012)

Knowing both the x_i and the z_i, we can reconstruct the y_i exactly by calculating the corresponding IWT.
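The equations for the direct and inverse Haar WT are not reproduced in this copy of the text. As a minimal sketch, assuming the transform acts on non-overlapping pairs (y_{2i}, y_{2i+1}) and that the coefficients listed above are applied in the order shown in the comments, the round trip could look like the following. The function names `haar_forward` and `haar_inverse` are hypothetical and chosen only for illustration.

```python
def haar_forward(y):
    """Decimated Haar WT of an even-length sequence.

    Assumption: samples are grouped into non-overlapping pairs, so no
    wraparound is needed. Low-pass uses 0.5, 0.5; high-pass uses 0.5, -0.5.
    """
    if len(y) % 2 != 0:
        raise ValueError("this sketch expects an even number of samples")
    x = [0.5 * y[i] + 0.5 * y[i + 1] for i in range(0, len(y), 2)]
    z = [0.5 * y[i] - 0.5 * y[i + 1] for i in range(0, len(y), 2)]
    return x, z


def haar_inverse(x, z):
    """Inverse transform: rebuild each pair from its x_i and z_i.

    Uses inverse coefficients 1.0, 1.0 for x and 1.0, -1.0 for z.
    """
    y = []
    for xi, zi in zip(x, z):
        y.append(xi + zi)  # first sample of the pair
        y.append(xi - zi)  # second sample of the pair
    return y


row = [10, 12, 7, 5, 200, 180]
x, z = haar_forward(row)
restored = haar_inverse(x, z)
```

With both the low-frequency and high-frequency values available, `restored` equals the original row, which is the exact-reconstruction property the text relies on.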
The IWT process is shown in Figure 2. Assuming that y_{2n+1} is known, the remaining y_i can be expressed in terms of the x_i and y_{2n+1}. Similar equations can be obtained by moving from top to bottom and from left to right. In other words, given the x_i, which are the low-frequency values of the decimated Haar WT of the y_i, and given the very last value of the y_i of a row or column of the original image or video frame, the y_i values for the entire row or column can be found. Therefore, besides the x_i values, one more value, the very last of the y_i, is stored in order to recreate the entire original row or column precisely. This is a negligible overhead considering that each row and column of the images or video frames in typical applications contains hundreds or even thousands of pixels.

By applying this procedure to every row and column of an image or video frame, the size is reduced to approximately 1/4 of the original, which can be reproduced exactly from its reduced version. The process can be repeated on the reduced images or video frames for further size reductions to 1/16, 1/64, etc., of the original. Of course, this cannot be done indefinitely, because the precision of the calculations must be limited to avoid increasing the required number of bits instead of reducing it, and because information is diluted at each 1/4 reduction. However, extensive tests show that the quality of image reproduction is maintained for up to 2 or 3 reduction levels, giving a size reduction of up to 16 or 64 times before compression by the codec, which is very significant in terms of video storage and transmission costs. The reproduction of the reduced-size frame by any standard codec is precise enough for the above calculations to recover the original full-size frames with quality similar to that of frames recovered by the codec without the initial size reduction step.
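To make the "approximately 1/4, 1/16, 1/64" figures concrete, the sketch below computes the frame dimensions after each reduction level. It uses the (x/2)+1 by (y/2)+1 dimensions given in Section IV for one level. Applying the same rule with integer division at every level, including when a dimension is odd, is an assumption of this sketch and is not specified in the text. The function names are hypothetical.

```python
def reduced_size(width, height, levels):
    """Frame dimensions after `levels` reductions.

    Assumption: every level maps a dimension n to n // 2 + 1, i.e. half
    the samples plus the one saved last row or column.
    """
    for _ in range(levels):
        width = width // 2 + 1
        height = height // 2 + 1
    return width, height


def size_ratio(width, height, levels):
    """Reduced pixel count as a fraction of the original pixel count."""
    w, h = reduced_size(width, height, levels)
    return (w * h) / (width * height)


for level in (1, 2, 3):
    dims = reduced_size(1920, 1080, level)
    print(level, dims, round(size_ratio(1920, 1080, level), 4))
```

For a 1920 by 1080 frame the ratios come out slightly above 1/4, 1/16 and 1/64, the difference being the saved last rows and columns.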
In addition, the process can be further extended by using, after more than 2 or 3 reduction levels, any of the expansion filters disclosed in US Patent No. 7,317,840 [2] to enlarge very small images with high quality. Such procedures can be extended to wavelets other than the Haar wavelet, although the calculations are more complicated and time consuming. In that case, the corresponding equations for the WT and IWT lead to a sparse system of linear equations in which only a small number of matrix elements are non-zero, resulting in a band diagonal matrix whose band width depends on the number of wavelet coefficients. There are software packages applicable to such systems, e.g., the Yale Sparse Matrix Package, but the Haar method above provides the quality and speed that make such more complicated approaches unnecessary in situations where real-time processing is an important requirement.

III. DESCRIPTION OF FRAME REDUCTION AND EXPANSION TECHNIQUES

As indicated above, applying the decimated Haar WT to a given video frame results in a frame that is 1/4 the original size, because only the low-frequency Haar wavelet filter is applied. As argued above, the high-frequency Haar wavelet filter need not be applied if the last original value of each pixel row or column, before wavelet transformation, is saved. With this information, all the preceding original pixels of a row or column can be calculated exactly. This process can be repeated on the resulting reduced frame for an additional size reduction to 1/16 of the original, and so on. The process is described in detail below.

IV. ONE-LEVEL FRAME SIZE REDUCTION AND EXPANSION BACK TO THE ORIGINAL SIZE

Figure 3 shows the process of reducing the size of a frame to about 1/4 of its original size.
"A" is the original frame with dimensions x and y. The decimated low-pass Haar WT is applied to A horizontally, resulting in frame "B" of dimensions (x/2)+1 and y. The last column of A, i.e., "LC", is copied to the last column of B. Next, the decimated low-pass Haar WT is applied to the (x/2)+1 columns of B, resulting in frame "C" of dimensions (x/2)+1 and (y/2)+1. Notice that pixel X (its R, G, B or Y, U, V component) is kept through this process. LR/2 is the decimated WT of LR and LC/2 is the decimated WT of LC.

The process of recovering the original frame from C of Figure 3 is shown in Figure 4. First, the last row of C is used to recover the columns of B precisely, using the reconstruction calculations described above. Finally, the reconstruction algorithm is applied to B horizontally, from right to left, starting with the values of LC reconstructed from the value of X, to recover A exactly.

The procedure can be interfaced to any existing codec, e.g., MPEG-4, H.264, VC-1, DivX, etc., to improve its compression performance significantly (60% to 80% reduction in storage and transmission costs for all extensively tested video files) with no loss of video quality compared to that produced by the original codec after decompression. First, the size reduction process is applied to the original frames of a given video file. Then the codec is applied to these smaller frames to produce a much smaller file than without the size reduction step. The resulting compressed video file can then be stored and/or transmitted at a greatly reduced cost.
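The one-level reduction from A to B to C described at the start of this section can be sketched as follows. This is an illustration only, assuming a single component plane stored as a list of rows, even width and height, and a low-pass filter that averages non-overlapping pairs. The helper names (`lowpass_pairs`, `reduce_rows`, `reduce_frame`) are hypothetical, and the expansion step of Figure 4 is not shown.

```python
def lowpass_pairs(values):
    """Decimated Haar low-pass: the mean of each non-overlapping pair."""
    return [0.5 * values[i] + 0.5 * values[i + 1]
            for i in range(0, len(values), 2)]


def reduce_rows(frame):
    """Low-pass every row, then append that row's last original value."""
    return [lowpass_pairs(row) + [row[-1]] for row in frame]


def transpose(frame):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(column) for column in zip(*frame)]


def reduce_frame(a):
    """One-level reduction A -> B -> C (even dimensions assumed)."""
    if len(a) % 2 != 0 or len(a[0]) % 2 != 0:
        raise ValueError("this sketch expects even width and height")
    b = reduce_rows(a)                        # (x/2)+1 columns, y rows
    c = transpose(reduce_rows(transpose(b)))  # (x/2)+1 columns, (y/2)+1 rows
    return c


a = [[1, 3, 5, 7],
     [9, 11, 13, 15],
     [2, 4, 6, 8],
     [10, 12, 14, 16]]
c = reduce_frame(a)
```

In this 4 by 4 example `c` is 3 by 3, its last column holds the decimated last column of `a` followed by the corner pixel, and the corner pixel itself (the "X" of Figure 3) is carried through unchanged.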
For decompression and display, the codec is applied for decompression, and then the above frame expansion procedure is used prior to displaying high-quality video at its original full size. Because of the lossy compression of existing standard video codecs, there is some minor loss of video quality compared to the original before compression by the codec, but the algorithm described here does not result in any perceived degradation of quality when compared to that produced by the codec on its own from a much larger file.

V. MULTIPLE-LEVEL FRAME SIZE REDUCTION AND EXPANSION

The process described in Figures 3 and 4 can be continued for one or more levels, starting with the C frame instead of the A frame. There are additional right columns and bottom rows to be saved, but they are one half the size of those of the previous level and, consequently, do not appreciably detract from the savings in storage and transmission bandwidth.

Figure 5 shows the process of reducing the frame size by one more level. Figure 6 shows the expansion by one level of a two-level size reduction. The original full size can then be recovered by the process of Figure 4. Figure 7 shows the process of going from 2-level size reduction to 3-level size reduction, and Figure 8 shows the process of expansion from 3-level size reduction to 2-level size reduction. Additional levels of reduction can be handled similarly.

VI. MODES OF OPERATION

The above ideas and algorithms can be implemented in a number of different ways.


Similar resources

A Fast Block Size Decision For Intra Coding in HEVC Standard

Intra coding in High efficiency video coding (HEVC) can significantly improve the compression efficiency using 35 intra-prediction modes for 2N×2N (N is an integer number ranging from six to two) luma blocks. To find the luma block with the minimum rate-distortion, it must perform 11932 different rate-distortion cost calculations. Although this approach improves coding efficiency compared to th...



Compressed-Domain Video Processing

compressed domain processing, transcoding, video editing, MPEG, splicing, reverse play, frame rate conversion, interlace to progressive, motion vector resampling Video compression algorithms are being used to compress digital video for a wide variety of applications, including video delivery over the Internet, advanced television broadcasting, video streaming, video conferencing, and video stor...


Employing a novel cross-diamond search in a modified hierarchical search motion estimation algorithm for video compression

The large amount of bandwidth that is required for the transmission or storage of digital videos is the main incentive for researchers to develop algorithms that aim at compressing video data (digital images) whilst keeping their quality as high as possible. Motion estimation algorithms are used for video compression as they reduce the memory requirements of any video file while maintaining its...


Low-Cost Super-Resolution Algorithms Implementation Over a HW/SW Video Compression Platform

Two approaches are presented in this paper to improve the quality of digital images over the sensor resolution using superresolution techniques: iterative super-resolution (ISR) and noniterative super-resolution (NISR) algorithms. The results show important improvements in the image quality, assuming that sufficient sample data and a reasonable amount of aliasing are available at the input imag...


Image and Video Coding/Transcoding: A Rate Distortion Approach

Due to the lossy nature of image/video compression and the expensive bandwidth and computation resources in a multimedia system, one of the key design issues for image and video coding/transcoding is to optimize trade-off among distortion, rate, and/or complexity. This thesis studies the application of rate distortion (RD) optimization approaches to image and video coding/transcoding for explor...




Publication date: 2012